RAMA: Easy Access to a High-Bandwidth Massively Parallel File System

Authors

  • Ethan L. Miller
  • Randy H. Katz
Abstract

Massively parallel file systems must provide high-bandwidth file access to programs running on their machines. Most accomplish this by striping files across arrays of disks attached to a few specialized I/O nodes in the massively parallel processor (MPP). This arrangement requires programmers to give the file system many hints about how their data should be laid out on disk if they want good performance. Additionally, the custom interface makes massively parallel file systems hard for programmers to use and difficult to integrate seamlessly into an environment with workstations and tertiary storage. The RAMA file system addresses these problems by providing a massively parallel file system that does not need user hints to perform well. RAMA takes advantage of the recent decrease in physical disk size by assuming that each processor in an MPP has one or more disks attached to it. Hashing is then used to pseudo-randomly distribute data across all of these disks, ensuring high bandwidth regardless of access pattern. Since MPP programs often have many nodes accessing a single file in parallel, the file system must allow access to different parts of the file without relying on a particular node. In RAMA, a file request involves only two nodes: the node making the request and the node on whose disk the data is stored. Thus, RAMA scales well to hundreds of processors. Since RAMA needs no layout hints from applications, it fits well into systems where users cannot (or will not) provide such hints. Fortunately, this flexibility does not cause a large loss of performance. RAMA’s simulated performance is within 10-15% of the optimum performance of a similarly sized striped file system, and is a factor of 4 or more better than a striped file system with poorly laid out data.

1995 USENIX Technical Conference January 16-20, 1995 Ne
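
The hashed placement described in the abstract can be illustrated with a minimal sketch. This is not RAMA's actual hash function or on-disk layout, which the abstract does not specify. The names `node_for_block` and `stripe_node`, the use of SHA-256, and the (file id, block number) key are all assumptions made for illustration. The sketch contrasts simple round-robin striping with hashed placement under a strided access pattern, the kind of poorly matched layout the abstract says striped systems handle badly.

```python
import hashlib
from collections import Counter


def node_for_block(file_id: int, block_no: int, num_nodes: int) -> int:
    """Hypothetical hashed placement: map a (file, block) pair to a node index.

    SHA-256 is only a stand-in for whatever hash a real system would use.
    """
    key = f"{file_id}:{block_no}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def stripe_node(block_no: int, num_nodes: int) -> int:
    """Simple round-robin striping, for comparison only."""
    return block_no % num_nodes


if __name__ == "__main__":
    num_nodes = 64
    # Strided pattern: every 64th block, e.g. one column of a row-major matrix.
    blocks = range(0, 6400 * num_nodes, num_nodes)

    striped = Counter(stripe_node(b, num_nodes) for b in blocks)
    hashed = Counter(node_for_block(7, b, num_nodes) for b in blocks)

    # Under striping every request in this pattern lands on node 0.
    print("striped: nodes used =", len(striped))
    # Under hashing the same requests spread over many nodes.
    print("hashed:  nodes used =", len(hashed),
          "busiest node load =", max(hashed.values()))
```

Because the target node is computed directly from the file and block identifiers, any requesting node can locate the data without consulting a third node, which matches the abstract's point that a request involves only the requester and the node holding the data.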

Similar resources

RAMA: An Easy-to-Use, High-Performance Parallel File System

Modern massively parallel file systems provide high bandwidth file access by striping files across arrays of disks attached to a few specialized I/O nodes. However, these file systems are hard to use and difficult to integrate with workstations and tertiary storage. RAMA addresses these problems by providing a high-performance massively parallel file system with a simple interface. RAMA uses ha...

RAMA: a file system for massively-parallel computers

This paper describes a file system design for massively parallel computers which makes very efficient use of a few disks per processor. This overcomes the traditional I/O bottleneck of massively parallel machines by storing the data on disks within the high-speed interconnection network. In addition, the file system, called RAMA, requires little inter-node synchronization, removing another comm...

Storage Hierarchy Management for Scientific Computing

Scientific computation has always been one of the driving forces behind the design of computer systems. As a result, many advances in CPU architecture were first developed for high-speed supercomputer systems, keeping them among the fastest computers in the world. However, little research has been done in storing the vast quantities of data that scientists manipulate on these powerful computers...

Time-Deterministic WDM Star Network for Massively Parallel Computing in Radar Systems

In massively parallel computer systems for embedded real-time applications there are normally very high bandwidth demands on the interconnection network. Other important properties are time-deterministic latency and services to guarantee that deadlines are met. In this paper we analyze how these properties vary with the design parameters for a passive optical star network, specifically when use...

NFS-cc: tuning NFS for concurrent read sharing

A common file access pattern found in cluster applications is concurrent read sharing: applications running on multiple sites read the same dataset concurrently. Traditional network file systems are limited by the server’s network bandwidth and therefore cannot satisfy the high-bandwidth concurrent reads that cluster applications typically require. This paper presents NFS-cc: a cooperative ...


Publication date: 1995